Abstract. We study a market for private data in which a data analyst
publicly releases a statistic over a database of private information.
Individuals that own the data incur a cost for their loss of privacy pro- 
portional to the differential privacy guarantee given by the analyst at 
the time of the release. The analyst incentivizes individuals by compen- 
sating them, giving rise to a privacy auction. Motivated by recommender 
systems, the statistic we consider is a linear predictor function with pub- 
licly known weights. The statistic can be viewed as a prediction of the 
unknown data of a new individual, based on the data of individuals 
in the database. We formalize the trade-off between privacy and accu- 
racy in this setting, and show that a simple class of estimates achieves 
an order-optimal trade-off. It thus suffices to focus on auction mecha- 
nisms that output such estimates. We use this observation to design a 
truthful, individually rational, proportional-purchase mechanism under 
a fixed budget constraint. We show that our mechanism is 5-approximate 
in terms of accuracy compared to the optimal mechanism, and that no 
truthful mechanism can achieve a $(2 - \epsilon)$-approximation, for any $\epsilon > 0$.



1 Introduction 

Recommender systems are ubiquitous on the Internet, lying at the heart of 
some of the most popular Internet services, including Netflix, Yahoo, and Amazon.
These systems use algorithms to predict, e.g., a user's rating for a movie,
her propensity to click on an advertisement or to purchase a product online. 
By design, such prediction algorithms rely on access to large training datasets, 
typically comprising data from thousands (often millions) of individuals. This 
large-scale collection of user data has raised serious privacy concerns among re- 
searchers and consumer advocacy groups. Privacy researchers have shown that 
access to seemingly non-sensitive data (e.g., movie ratings) can lead to the leakage
of potentially sensitive information when combined with de-anonymization
techniques [1]. Moreover, a spate of recent lawsuits [2,3,4] as well as behavioral
studies [5] have demonstrated the increasing reluctance of the public to allow 
the unfettered collection and monetization of user data. 

As a result, researchers and advocacy groups have argued in favor of legislation
protecting individuals, by ensuring they can "opt-out" from data collection
if they so desire [6]. However, a widespread restriction on data collection would
be detrimental to profits of the above companies. One way to address this tension
between the value of data and the users' need for privacy is through incentivization.
In short, companies releasing an individual's data ought to appropriately
compensate her for the violation of her privacy, thereby incentivizing her consent
to the release.

We study the issue of user incentivization through privacy auctions, as introduced
by Ghosh and Roth [7]. In a privacy auction, a data analyst has access
to a database $d \in \mathbb{R}^n$ of private data $d_i$, $i = 1, \ldots, n$, each corresponding to a
different individual. This data may represent information that is to be protected,
such as an individual's propensity to click on an ad or purchase a product, or the
number of visits to a particular website. The analyst wishes to publicly release
an estimate $\hat{s}(d)$ of a statistic $s(d)$ evaluated over the database. In addition,
each individual incurs a privacy cost $c_i$ upon the release of the estimate $\hat{s}(d)$,
and must be appropriately compensated by the analyst for this loss of utility.
The analyst has a budget, which limits the total compensation paid out. As such,
given a budget and a statistic $s$, the analyst must (a) solicit the costs $c_i$ of
individuals and (b) determine the estimate $\hat{s}$ to release as well as the appropriate
compensation to each individual.

Ghosh and Roth employ differential privacy [8] as a principled approach to
quantifying the privacy cost $c_i$. Informally, ensuring that $\hat{s}(d)$ is $\epsilon$-differentially
private with respect to individual $i$ provides a guarantee on the privacy of this
individual; a small $\epsilon$ corresponds to better privacy since it guarantees that $\hat{s}(d)$ is
essentially independent of the individual's data $d_i$. Privacy auctions incorporate
this notion by assuming that each individual $i$ incurs a cost $c_i = c_i(\epsilon)$ that is a
function of the privacy guarantee $\epsilon$ provided by the analyst.

1.1 Our Contribution 

Motivated by recommender systems, we focus in this paper on a scenario where 
the statistic s takes the form of a linear predictor: 

    $s(d) := \langle w, d \rangle = \sum_{i=1}^{n} w_i d_i$,    (1)

where $w \in \mathbb{R}^n$ is a publicly known vector of real (possibly negative) weights.
Intuitively, the public weights $w_i$ serve as measures of the similarity between
each individual $i$ and a new individual, outside the database. The function $s(d)$
can then be interpreted as a prediction of the value $d$ for this new individual.

Linear predictors of the form (1) include many well-studied methods of statistical
inference, such as the $k$-nearest-neighbor method, the Nadaraya-Watson
weighted average, ridge regression, as well as support vector machines. We provide
a brief review of such methods in Section 5. Functions of the form (1) are
thus of particular interest in the context of recommender systems [9,10], as well
as other applications involving predictions (e.g., polling/surveys, marketing). In
the sequel, we ignore the provenance of the public weights w, keeping in mind 
that any of these methods apply. Our contributions are as follows: 

1. Privacy-Accuracy Trade-off. We characterize the accuracy of the estimate
$\hat{s}$ in terms of the distortion between the linear predictor $s$ and $\hat{s}$, defined
as $\delta(s, \hat{s}) := \max_{d} \mathbb{E}[|s(d) - \hat{s}(d)|^2]$, i.e., the maximum mean square error
between $s(d)$ and $\hat{s}(d)$ over all databases $d$. We define a privacy index $\beta(\hat{s})$
that captures the amount of privacy an estimator $\hat{s}$ provides to individuals
in the database. We show that any estimator $\hat{s}$ with low distortion must also
have a low privacy index (Theorem 1).

2. Laplace Estimators Suffice. We show that a special class of Laplace estimators
[11] (i.e., estimators that use noise drawn from a Laplace distribution),
which we call Discrete Canonical Laplace Estimator Functions
(DCLEFs), exhibits an order-optimal trade-off between privacy and distortion
(Theorem 2). This allows us to restrict our focus to privacy auctions
that output DCLEFs as estimators of the linear predictor $s$.

3. Truthful, 5-approximate Mechanism, and Lower Bound. We design a
truthful, individually rational, and budget feasible mechanism that outputs a
DCLEF as an estimator of the linear predictor (Theorem 3). Our estimator's
accuracy is a 5-approximation with respect to the DCLEF output by an
optimal, individually rational, budget feasible mechanism. We also prove
a lower bound (Theorem 4): there is no truthful DCLEF mechanism that
achieves an approximation ratio $2 - \epsilon$, for any $\epsilon > 0$.

In our analysis, we exploit the fact that, when $\hat{s}$ is a Laplace estimator,
minimizing distortion under a budget resembles the knapsack problem. As a result,
the problem of designing a privacy auction that outputs a DCLEF $\hat{s}$ is similar
in spirit to the knapsack auction mechanism [12]. However, our setting poses 
an additional challenge because the privacy costs exhibit externalities: the cost 
incurred by an individual is a function of which other individuals are being com- 
pensated. Despite the externalities in costs, we achieve the same approximation 
as the one known for the knapsack auction mechanism [12]. 

1.2 Related Work 

Privacy of behavioral data. Differentially-private algorithms have been de- 
veloped for the release of several different kinds of online user behavioral data 
such as click-through rates and search-query frequencies [13], as well as movie 
ratings [14]. As pointed out by McSherry and Mironov [14], the reason why the 
release of such data constitutes a privacy violation is not necessarily that, e.g., 
individuals perceive it as embarrassing, but that it renders them susceptible to 
linkage and de-anonymization attacks [1]. Such linkages could allow, for example, 
an attacker to piece together an individual's address stored in one database with 
his credit card number or social security number stored in another database. It 
is therefore natural to attribute a loss of utility to the disclosure of such data. 

Privacy auctions. Quantifying the cost of privacy loss allows one to study 
privacy in the context of an economic transaction. Ghosh and Roth initiate this 
study of privacy auctions in the setting where the data is binary and the statistic 
reported is the sum of bits, i.e., $d_i \in \{0, 1\}$ and $w_i = 1$ for all $i = 1, \ldots, n$ [7].
Unfortunately, the Ghosh-Roth auction mechanism cannot be readily generalized
to asymmetric statistics such as (1), which, as discussed in Section 5, have
numerous important applications including recommender systems. Our Theorems
1 and 2, which parallel the characterization of order-optimal estimators
in [7], imply that to produce an accurate estimate of $s$, the estimator $\hat{s}$ must
provide different privacy guarantees to different individuals. This is in contrast
to the multi-unit procurement auction of [7]. In fact, as discussed in the introduction,
a privacy auction outputting a DCLEF $\hat{s}(d)$ has many similarities with a
knapsack auction mechanism [12], with the additional challenge of externalities
introduced by the Laplacian noise (see also Section 4).

Privacy and truthfulness in mechanism design. A series of interesting 
results follow an orthogonal direction, namely, on the connection between pri- 
vacy and truthfulness when individuals have the ability to misreport their data. 
Starting with the work of McSherry and Talwar [15], followed by Nissim et al. [16],
Xiao [17], and most recently Chen et al. [18], these papers design mechanisms that
are simultaneously truthful and privacy-preserving (using differential privacy or 
other closely related definitions of privacy). As pointed out by Xiao [17], all 
these papers consider an unverified database, i.e., the mechanism designer can- 
not verify the data reported by individuals and therefore must incentivize them 
to report truthfully. Recent work on truthfully eliciting private data through a
survey [19,20] also falls under the unverified database setting [17]. In contrast,
our setting, as well as that of Ghosh and Roth, is that of a verified database, 
in which individuals cannot lie about their data. This setting is particularly rel- 
evant to the context of online behavioral data: information on clicks, websites 
visited and products purchased is collected and stored in real-time and cannot 
be retracted after the fact. 

Correlation between privacy costs and data values. An implicit as- 
sumption in privacy auctions as introduced in [7] is that the privacy costs are 
not correlated with the data values di. This might not be true if, e.g., the data 
represents the propensity of an individual to contract a disease. Ghosh and Roth 
[7] show that when the privacy costs are correlated with the data, no individually
rational direct revelation mechanism can simultaneously achieve non-trivial ac- 
curacy and differential privacy. As discussed in the beginning of this section, the 
privacy cost of the release of behavioral data is predominantly due to the risk 
of a linkage attack. It is reasonable in many cases to assume that this risk (and 
hence the cost of privacy loss) is not correlated to, e.g., the user's movie ratings. 
Nevertheless, due to its importance in other settings such as medical data, more 
recent privacy auction models aim at handling such correlation [19,20,21]; we 
leave generalizing our results to such privacy auction models as future work. 

2 Preliminaries 

Let $[k] = \{1, \ldots, k\}$, for any integer $k > 0$, and define $I := [R_{\min}, R_{\max}] \subset \mathbb{R}$ to
be a bounded real interval. Consider a database containing the information of
$n > 0$ individuals. In particular, the database comprises a vector $d$, whose entries
$d_i \in I$, $i \in [n]$, represent the private information of individual $i$. Each entry
$d_i$ is a priori known to the database administrator, and therefore individuals
do not have the ability to lie about their private data. A data analyst with
access to the database would like to publicly release an estimate of the statistic
$s(d)$ of the form (1), i.e., $s(d) = \sum_{i \in [n]} w_i d_i$, for some publicly known weight
vector $w = (w_1, \ldots, w_n) \in \mathbb{R}^n$. For any subset $H \subseteq [n]$, we define
$w(H) := \sum_{i \in H} |w_i|$, and denote by $W := \|w\|_1 = \sum_{i=1}^{n} |w_i|$ the $\ell_1$ norm of vector $w$.

We denote the length of interval $I$ by $\Delta := R_{\max} - R_{\min}$, and its midpoint by
$R := (R_{\min} + R_{\max})/2$. Without loss of generality, we assume that $w_i \neq 0$ for
all $i \in [n]$: if not, since entries for which $w_i = 0$ do not contribute to the linear
predictor, it suffices to consider the entries of $d$ for which $w_i \neq 0$.



2.1 Differential Privacy and Distortion

Similar to [7], we use the following generalized definition of differential privacy: 

Definition 1 (Differential Privacy). A (randomized) function $f : I^n \to \mathbb{R}^m$
is $(\epsilon_1, \ldots, \epsilon_n)$-differentially private if for each individual $i \in [n]$ and for any pair
of data vectors $d, d^{(i)} \in I^n$ differing in only their $i$-th entry, $\epsilon_i$ is the smallest
value such that $P[f(d) \in S] \leq e^{\epsilon_i} P[f(d^{(i)}) \in S]$ for all $S \subset \mathbb{R}^m$.

This definition differs slightly from the usual definition of $\epsilon$-differential privacy
[11], as the latter is stated in terms of the worst-case privacy across all
individuals. More specifically, according to the notation in [11], an
$(\epsilon_1, \ldots, \epsilon_n)$-differentially private function is $\epsilon$-differentially private, where $\epsilon = \max_i \epsilon_i$.

Given a deterministic function $f$, a well-known method to provide $\epsilon$-differential
privacy is to add random noise drawn from a Laplace distribution to this function
[11]. This readily extends to $(\epsilon_1, \ldots, \epsilon_n)$-differential privacy.

Lemma 1 ([11]) Consider a deterministic function $f : I^n \to \mathbb{R}$. Define
$\hat{f}(d) := f(d) + \mathrm{Lap}(\sigma)$, where $\mathrm{Lap}(\sigma)$ is a random variable sampled from the Laplace
distribution with parameter $\sigma$. Then, $\hat{f}$ is $(\epsilon_1, \ldots, \epsilon_n)$-differentially private, where
$\epsilon_i = S_i(f)/\sigma$, and $S_i(f) := \max_{d, d^{(i)} \in I^n} |f(d) - f(d^{(i)})|$ is the sensitivity of $f$
to the $i$-th entry $d_i$, $i \in [n]$.

Intuitively, the larger the parameter $\sigma$ of the Laplace noise added to $f$ (the noise
variance is $2\sigma^2$), the smaller $\epsilon_i$ and, hence, the better the privacy guarantee of $\hat{f}$.
Moreover, for a fixed $\sigma$, entries $i$ with higher sensitivity $S_i(f)$ receive a worse
privacy guarantee (higher $\epsilon_i$).
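
To make Lemma 1 concrete for the linear predictor (1), the following minimal sketch adds Laplace noise to $\langle w, d \rangle$ and reports the resulting per-individual guarantees. It assumes that the sensitivity of the linear predictor to entry $i$ is $|w_i|(R_{\max} - R_{\min})$, which follows from the definition of $S_i(f)$; the function names are illustrative and not part of the paper.

```python
import random

def laplace_sample(scale, rng=random):
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def linear_predictor(w, d):
    """s(d) = <w, d>."""
    return sum(wi * di for wi, di in zip(w, d))

def private_linear_predictor(w, d, sigma, rng=random):
    """Laplace-perturbed estimate: s(d) + Lap(sigma)."""
    return linear_predictor(w, d) + laplace_sample(sigma, rng)

def privacy_guarantees(w, r_min, r_max, sigma):
    """Per-individual epsilon_i = S_i / sigma, with S_i = |w_i| * (r_max - r_min)."""
    delta = r_max - r_min
    return [abs(wi) * delta / sigma for wi in w]

# Example: three individuals with data in [0, 1] and noise parameter sigma = 2.
weights = [1.0, -2.0, 0.5]
epsilons = privacy_guarantees(weights, 0.0, 1.0, 2.0)
```

In this example the individual with the largest absolute weight receives the weakest guarantee (largest $\epsilon_i$), as the paragraph above suggests.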

There is a natural tradeoff between the amount of noise added and the accuracy
of the perturbed function $\hat{f}$. To capture this, we introduce the notion of
distortion between two (possibly randomized) functions:

Definition 2 (Distortion). Given two functions $f : I^n \to \mathbb{R}$ and $\hat{f} : I^n \to \mathbb{R}$,
the distortion, $\delta(f, \hat{f})$, between $f$ and $\hat{f}$ is given by

    $\delta(f, \hat{f}) := \max_{d \in I^n} \mathbb{E}\left[ |f(d) - \hat{f}(d)|^2 \right]$.

In our setup, the data analyst wishes to disclose an estimator function
$\hat{s} : I^n \to \mathbb{R}$ of the linear predictor $s$. Intuitively, a good estimator $\hat{s}$ should have a
small distortion $\delta(s, \hat{s})$, while also providing good differential privacy guarantees.
2.2 Privacy Auction Mechanisms

Each individual $i \in [n]$ has an associated cost function $c_i : \mathbb{R}_+ \to \mathbb{R}_+$, which
determines the cost $c_i(\epsilon_i)$ incurred by $i$ when an $(\epsilon_1, \ldots, \epsilon_n)$-differentially private
estimate $\hat{s}$ is released by the analyst. As in [7], we consider linear cost functions,
i.e., $c_i(\epsilon) = v_i \epsilon$, for all $i \in [n]$. We refer to $v_i$ as the unit-cost of individual $i$.
The unit-costs $v_i$ are not a priori known to the data analyst. Without loss of
generality, we assume throughout the paper that $v_1 \leq \ldots \leq v_n$.

Given a weight vector $w = (w_1, \ldots, w_n) \in \mathbb{R}^n$, let $M_s$ be a mechanism
compensating individuals in $[n]$ for their loss of privacy from the release of an
estimate $\hat{s}$ of the linear predictor $s(d)$. Formally, $M_s$ takes as input a vector of
reported unit-costs $v = (v_1, \ldots, v_n) \in \mathbb{R}^n$ and a budget $B$, and outputs

1. a payment $p_i \in \mathbb{R}_+$ for every $i \in [n]$, and

2. an estimator function $\hat{s} : I^n \to \mathbb{R}$.

Assume that the estimator $\hat{s}$ satisfies $(\epsilon_1, \ldots, \epsilon_n)$-differential privacy. A mechanism
is budget feasible if $\sum_{i \in [n]} p_i \leq B$, i.e., the payments made by the mechanism
are within the budget $B$. Moreover, a mechanism is individually rational
if for all $i \in [n]$, $p_i \geq c_i(\epsilon_i) = v_i \epsilon_i$, i.e., payments made by the mechanism
cover the cost incurred by individuals. Finally, a mechanism is truthful if for all
$i \in [n]$, $p_i(v_i, v_{-i}) - v_i \epsilon_i(v_i, v_{-i}) \geq p_i(v_i', v_{-i}) - v_i \epsilon_i(v_i', v_{-i})$, i.e., no individual
can improve her utility by misreporting her private unit-cost.

2.3 Outline of our approach 

We denote by $\delta_{M_s} := \delta(s, \hat{s})$ the distortion between $s$ and the function $\hat{s}$ output by
the mechanism $M_s$. Ideally, a mechanism should output an estimator that has
small distortion. However, the smaller the distortion, the higher the privacy vio- 
lation and, hence, the more money the mechanism needs to spend. As such, the 
objective of this paper is to design a mechanism with minimal distortion, subject 
to the constraints of truthfulness, individual rationality, and budget feasibility. 

To address this question, in Section 3, we first establish a privacy-distortion
tradeoff for differentially-private estimators of the linear predictor. We then
introduce a family of estimators, Discrete Canonical Laplace Estimator Functions
(DCLEFs), and show that they achieve a near-optimal privacy-distortion tradeoff.
This result allows us to limit our attention to DCLEF privacy auction mechanisms,
i.e., mechanisms that output a DCLEF $\hat{s}$. In Section 4, we present a
mechanism that is truthful, individually rational, and budget feasible, while also
being near-optimal in terms of distortion.

3 Privacy-Distortion Tradeoff and Laplace Estimators 

Recall that a good estimator should exhibit low distortion and simultaneously 
give good privacy guarantees. In this section, we establish the privacy-distortion 
tradeoff for differentially-private estimators of the linear predictor. Moreover, 
we introduce a family of estimators that exhibits a near-optimal tradeoff between
privacy and distortion. This will motivate our focus on privacy auction
mechanisms that output estimators from this class in Section 4. 

3.1 Privacy-Distortion Tradeoff 

There exists a natural tension between privacy and distortion, as highlighted by 
the following two examples. 

Example 1. Consider the estimator $\hat{s} := R \sum_{i=1}^{n} w_i$, where recall that
$R = (R_{\min} + R_{\max})/2$. This estimator guarantees perfect privacy (i.e., $\epsilon_i = 0$) for all
individuals. However, $\delta(s, \hat{s}) = (W\Delta)^2/4$.

Example 2. Consider the estimator function $\hat{s} := \sum_{i=1}^{n} w_i d_i$. In this case,
$\delta(s, \hat{s}) = 0$. However, $\epsilon_i = \infty$ for all $i \in [n]$.
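
The two distortion values in Examples 1 and 2 can be checked numerically on a small instance. The sketch below is an illustration only: it relies on the observation that, for a deterministic estimator that is affine in $d$, the squared error is convex in $d$, so its maximum over the box $I^n$ is attained at a corner. The helper names and the example weights are assumptions made for this illustration.

```python
from itertools import product

def linear_predictor(w, d):
    """s(d) = <w, d>."""
    return sum(wi * di for wi, di in zip(w, d))

def worst_case_squared_error(w, r_min, r_max, estimate):
    """Max over corner databases of (s(d) - estimate(d))^2.

    Valid as the distortion only for deterministic estimators affine in d.
    """
    return max(
        (linear_predictor(w, d) - estimate(d)) ** 2
        for d in product((r_min, r_max), repeat=len(w))
    )

weights = [1.0, -2.0, 0.5]          # W = 3.5
r_min, r_max = 0.0, 1.0             # Delta = 1, midpoint R = 0.5
midpoint = (r_min + r_max) / 2

# Example 1: data-independent estimate R * sum(w_i).
constant_distortion = worst_case_squared_error(
    weights, r_min, r_max, lambda d: midpoint * sum(weights))

# Example 2: the exact linear predictor, with no privacy.
exact_distortion = worst_case_squared_error(
    weights, r_min, r_max, lambda d: linear_predictor(weights, d))
```

For these weights, $(W\Delta)^2/4 = 3.5^2/4 = 3.0625$, which is the value the first computation is expected to reproduce.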

In order to formalize this tension between privacy and distortion, we define 
the privacy index of an estimator as follows. 

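Definition 3 Let $\hat{s} : I^n \to \mathbb{R}$ be any $(\epsilon_1, \ldots, \epsilon_n)$-differentially private estimator
function for the linear predictor. We define the privacy index, $\beta(\hat{s})$, of $\hat{s}$ as

    $\beta(\hat{s}) := \max \left\{ w(H) : H \subseteq [n] \text{ and } \sum_{i \in H} \epsilon_i < 1/2 \right\}$.    (2)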

$\beta(\hat{s})$ captures the weight of the individuals that have been guaranteed good
privacy by $\hat{s}$. Next we characterize the impossibility of having an estimator with
a low distortion but a high privacy index. Note that for Example 1, $\beta(\hat{s}) = W$,
i.e., the largest value possible, while for Example 2, $\beta(\hat{s}) = 0$. We stress that the
selection of 1/2 as an upper bound in (2) is arbitrary; Theorems 1 and 2 still 
hold if another value is used, though the constants involved will differ. 
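
As an illustration of Definition 3, the privacy index of a given vector of guarantees can be computed by exhaustive search over subsets $H$. This is a sketch for small $n$ only (the search is exponential in $n$); the function name and the `threshold` parameter are assumptions introduced here, with the default $1/2$ taken from (2).

```python
from itertools import combinations

def privacy_index(w, eps, threshold=0.5):
    """Brute-force beta: max total |w_i| over subsets H with sum of eps_i < threshold."""
    n = len(w)
    best = 0.0
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if sum(eps[i] for i in subset) < threshold:
                best = max(best, sum(abs(w[i]) for i in subset))
    return best

weights = [1.0, -2.0, 0.5]
beta_perfect_privacy = privacy_index(weights, [0.0, 0.0, 0.0])          # Example 1
beta_no_privacy = privacy_index(weights, [float("inf")] * 3)            # Example 2
beta_mixed = privacy_index(weights, [0.3, 0.3, 0.1])
```

In the mixed case, the best subset consists of the second and third individuals (total guarantee 0.4, total weight 2.5), since adding the first would push the sum of guarantees past the threshold.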

Our first main result, which is proved in Appendix A, establishes a trade-off 
between the privacy index and the distortion of an estimator. 

Theorem 1 (Trade-off between Privacy-index and Distortion) Let
$0 < \alpha < 1$. Let $\hat{s} : I^n \to \mathbb{R}$ be an arbitrary estimator function for the linear predictor.
If $\delta(s, \hat{s}) \leq (\alpha W \Delta)^2/48$ then $\beta(\hat{s}) \leq 2\alpha W$.

In other words, if an estimator has low distortion, the weight of individuals with
a good privacy guarantee (i.e., a small $\epsilon_i$) can be at most an $\alpha$ fraction of $2W$.

3.2 Laplace Estimator Functions 

Consider the following family of estimators for the linear predictor $s : I^n \to \mathbb{R}$:

    $\hat{s}(d; a, x, \sigma) := \sum_{i=1}^{n} w_i d_i x_i + \sum_{i=1}^{n} w_i a_i (1 - x_i) + \mathrm{Lap}(\sigma)$    (3)

where $x_i \in [0, 1]$, and each $a_i \in \mathbb{R}$ is a constant independent of the data vector
$d$. This function family is parameterized by $x$, $a$ and $\sigma$. The estimator $\hat{s}$ results
from distorting $s$ in two ways: (a) a randomized distortion by the addition of
the Laplace noise, and (b) a deterministic distortion through a linear interpolation
between each entry $d_i$ and some constant $a_i$. Intuitively, the interpolation
parameter $x_i$ determines the extent to which the estimate $\hat{s}$ depends on entry
$d_i$. Using Lemma 1 and the definition of distortion, it is easy to characterize the
privacy and distortion properties of such estimators.

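Lemma 2 Given $w_i$, $i \in [n]$, let $s(d)$ be the linear predictor given by (1), and
$\hat{s}$ an estimator of $s$ given by (3). Then,

1. $\hat{s}$ is $(\epsilon_1, \ldots, \epsilon_n)$-differentially private, where $\epsilon_i = \frac{|w_i| x_i \Delta}{\sigma}$, $i \in [n]$.

2. The distortion satisfies $\delta(s, \hat{s}) \geq \frac{\Delta^2}{4} \left( \sum_{i=1}^{n} |w_i| (1 - x_i) \right)^2 + 2\sigma^2$, with equality
attained when $a_i = R$, for all $i \in [n]$.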

The proof of this lemma can be found in Appendix B. Note that the constants
$a_i$ do not affect the differential privacy properties of $\hat{s}$. Moreover, among all
estimators with given $x$, the distortion $\delta(s, \hat{s})$ is minimized when $a_i = R$ for all
$i \in [n]$. In other words, to minimize the distortion without affecting privacy, it is
always preferable to interpolate between $d_i$ and $R$. This motivates us to define
the family of Laplace estimator functions as follows.

Definition 4 Given $w_i$, $i \in [n]$, the Laplace estimator function family (LEF)
for the linear predictor $s$ is the set of functions $\hat{s} : I^n \to \mathbb{R}$, parameterized by $x$
and $\sigma$, such that

    $\hat{s}(d; x, \sigma) = \sum_{i=1}^{n} w_i d_i x_i + R \sum_{i=1}^{n} w_i (1 - x_i) + \mathrm{Lap}(\sigma)$    (4)

We call a LEF discrete if $x_i \in \{0, 1\}$. Furthermore, we call a LEF canonical
if the Laplace noise added to the estimator has a parameter of the form

    $\sigma = \sigma(x) := \Delta \sum_{i=1}^{n} |w_i| (1 - x_i)$    (5)

Recall that $x_i$ controls the dependence of $\hat{s}$ on the entry $d_i$; thus, intuitively,
the standard deviation of the noise added in a canonical Laplace estimator is
proportional to the "residual weight" of data entries. Note that, by Lemma 2,
the distortion of a canonical Laplace estimator $\hat{s}$ has the following simple form:

    $\delta(s, \hat{s}) = \frac{9}{4} \Delta^2 \left( \sum_{i=1}^{n} |w_i| (1 - x_i) \right)^2 = \frac{9}{4} \Delta^2 \left( W - \sum_{i=1}^{n} |w_i| x_i \right)^2$.    (6)
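
The closed forms in Lemma 2 and (5) can be evaluated directly, which also gives a quick numerical check of the factor $9/4$ in (6): substituting $\sigma = \Delta \sum_i |w_i|(1 - x_i)$ into the distortion with $a_i = R$ gives $\frac{\Delta^2}{4} r^2 + 2\Delta^2 r^2 = \frac{9}{4}\Delta^2 r^2$, where $r$ is the residual weight. The sketch below is illustrative; the function names are not from the paper.

```python
def residual_weight(w, x):
    """Sum of |w_i| (1 - x_i): the weight of entries the estimate does not depend on."""
    return sum(abs(wi) * (1 - xi) for wi, xi in zip(w, x))

def canonical_sigma(w, x, delta):
    """Noise parameter of a canonical Laplace estimator, as in (5)."""
    return delta * residual_weight(w, x)

def lef_privacy(w, x, delta, sigma):
    """Per-individual guarantees of a Laplace estimator (Lemma 2, part 1)."""
    return [abs(wi) * xi * delta / sigma for wi, xi in zip(w, x)]

def lef_distortion(w, x, delta, sigma):
    """Distortion of a Laplace estimator with a_i = R (Lemma 2, part 2, with equality)."""
    return (delta ** 2 / 4) * residual_weight(w, x) ** 2 + 2 * sigma ** 2

# A discrete canonical example: only the first individual's data is used.
weights, x, delta = [1.0, -2.0, 0.5], [1, 0, 0], 1.0
sigma = canonical_sigma(weights, x, delta)                 # 2.5
distortion = lef_distortion(weights, x, delta, sigma)
closed_form = 9 / 4 * delta ** 2 * residual_weight(weights, x) ** 2
epsilons = lef_privacy(weights, x, delta, sigma)
```

Individuals with $x_i = 0$ receive $\epsilon_i = 0$, while the selected individual's guarantee is her weight divided by the residual weight.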

Our next result establishes that there exists a discrete canonical Laplace 
estimator function (DCLEF) with a small distortion and a high privacy index. 

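Theorem 2 (DCLEFs suffice) Let $0 < \alpha < 1$. Let

    $\hat{s}^* := \arg\max_{\hat{s} \,:\, \delta(s, \hat{s}) \leq (\alpha W \Delta)^2/48} \beta(\hat{s})$

be an estimator with the highest privacy index among all $\hat{s}$ for which $\delta(s, \hat{s}) \leq
(\alpha W \Delta)^2/48$. There exists a DCLEF $\hat{s}^\circ : I^n \to \mathbb{R}$ such that $\delta(s, \hat{s}^\circ) \leq (9/4)(\alpha W \Delta)^2$,
and $\beta(\hat{s}^\circ) \geq \frac{1}{2} \beta(\hat{s}^*)$.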

In other words, there exists a DCLEF that is within a constant factor, in terms of 
both its distortion and its privacy index, from an optimal estimator $\hat{s}^*$. Theorem
2 is proved in Appendix C and has the following immediate corollary: 

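Corollary 1 Consider an arbitrary estimator $\hat{s}$ with distortion $\delta(s, \hat{s}) \leq (W\Delta)^2/48$.
Then, there exists a DCLEF $\hat{s}^\circ$ such that $\delta(s, \hat{s}^\circ) \leq 108\,\delta(s, \hat{s})$ and $\beta(\hat{s}^\circ) \geq \frac{1}{2} \beta(\hat{s})$.

Proof. Apply Theorem 2 with $\alpha = \sqrt{48\,\delta(s, \hat{s})}/(W\Delta)$. In particular, for this $\alpha$
and $\hat{s}^*$ as in the theorem statement, we have that $\hat{s}^* := \arg\max_{\hat{s}' \,:\, \delta(s, \hat{s}') \leq \delta(s, \hat{s})} \beta(\hat{s}')$,
hence $\beta(\hat{s}^*) \geq \beta(\hat{s})$. Therefore, there exists a DCLEF $\hat{s}^\circ$ such that $\delta(s, \hat{s}^\circ) \leq
(9/4)(\alpha W \Delta)^2 \leq 108\,\delta(s, \hat{s})$, and $\beta(\hat{s}^\circ) \geq \frac{1}{2} \beta(\hat{s}^*) \geq \frac{1}{2} \beta(\hat{s})$.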

Theorems 1 and 2 imply that, when searching for estimators with low distortion 
and high privacy index, it suffices (up to constant factors) to focus on DCLEFs. 
Similar results were derived in [7] for estimators of unweighted sums of bits. 

4 Privacy Auction Mechanism 

Motivated by Theorems 1 and 2, we design a truthful, individually rational, 
budget-feasible DCLEF mechanism (i.e., a mechanism that outputs a DCLEF) 
and show that it is 5-approximate in terms of accuracy compared with the op- 
timal, individually rational, budget-feasible DCLEF mechanism. Note that a 
DCLEF is fully determined by the vector $x \in \{0, 1\}^n$. Therefore, we will simply
refer to the output of the DCLEF mechanisms described below as (x, p), as the 
latter characterize the released estimator and the compensations to individuals. 

4.1 An Optimal DCLEF Mechanism 

Consider the problem of designing a DCLEF mechanism $M$ that is individually
rational and budget feasible (but not necessarily truthful), and minimizes
$\delta_M$. Given a DCLEF $\hat{s}$, define $H(\hat{s}) := \{i : x_i = 1\}$ to be the set of individuals
that receive non-zero differential privacy guarantees. Eq. (6) implies that
$\delta(s, \hat{s}) = \frac{9}{4} \Delta^2 (W - w(H(\hat{s})))^2$. Thus, minimizing $\delta(s, \hat{s})$ is equivalent to maximizing
$w(H(\hat{s}))$. Let $(x_{opt}, p_{opt})$ be an optimal solution to the following problem:

    maximize     $S(x; w) = \sum_{i=1}^{n} |w_i| x_i$
    subject to:  $p_i \geq v_i \epsilon_i(x)$, $\forall i \in [n]$    (individual rationality)
                 $\sum_{i=1}^{n} p_i \leq B$    (budget feasibility)                      (7)
                 $x_i \in \{0, 1\}$, $\forall i \in [n]$    (discrete estimator function)

where, by Lemma 2 and (5),

    $\epsilon_i(x) = \frac{|w_i| x_i}{\sum_{j=1}^{n} |w_j| (1 - x_j)}$, $\forall i \in [n]$    (canonical property).    (8)



A mechanism $M_{opt}$ that outputs $(x_{opt}, p_{opt})$ will be an optimal, individually
rational, budget feasible (but not necessarily truthful) DCLEF mechanism. Let
$OPT := S(x_{opt}; w)$ be the optimal objective value of (7). We use OPT as the
benchmark to which we compare the (truthful) mechanism we design below.
Without loss of generality, we make the following assumption:

Assumption 5 For all $i \in [n]$, $|w_i| v_i/(W - |w_i|) \leq B$.

Observe that if an individual $i$ violates this assumption, then $c_i(\epsilon_i(x)) > B$ for
any $x$ output by a DCLEF mechanism that sets $x_i = 1$. In other words, no
DCLEF mechanism (including $M_{opt}$) can compensate this individual within the
analyst's budget and, hence, will set $x_i = 0$. Therefore, it suffices to focus on the
subset of individuals for whom the assumption holds.
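
For small instances, the benchmark OPT can be computed by enumerating all vectors $x \in \{0, 1\}^n$. The sketch below is an illustration under stated assumptions rather than part of the paper: it uses the fact that, for a fixed $x$, the cheapest individually rational payments are $p_i = v_i \epsilon_i(x)$, so $x$ is feasible exactly when these payments sum to at most $B$. It also treats the all-ones vector as infeasible, since the residual weight (and hence the noise) would be zero. The function names are hypothetical.

```python
import math
from itertools import product

def canonical_epsilons(w, x):
    """Guarantees eps_i(x) = |w_i| x_i / sum_j |w_j| (1 - x_j) of a DCLEF."""
    residual = sum(abs(wj) * (1 - xj) for wj, xj in zip(w, x))
    if residual == 0:
        # No noise is added: selected individuals get no privacy guarantee.
        return [math.inf if xi else 0.0 for xi in x]
    return [abs(wi) * xi / residual for wi, xi in zip(w, x)]

def brute_force_opt(w, v, budget):
    """Return (OPT, x_opt) by exhaustive search; exponential in n."""
    best_value, best_x = 0.0, tuple(0 for _ in w)
    for x in product((0, 1), repeat=len(w)):
        eps = canonical_epsilons(w, x)
        min_total_payment = sum(vi * ei for vi, ei in zip(v, eps))
        if min_total_payment <= budget:
            value = sum(abs(wi) * xi for wi, xi in zip(w, x))
            if value > best_value:
                best_value, best_x = value, x
    return best_value, best_x

# Uniform weights: selecting k of 4 individuals costs k / (4 - k) in total.
opt_uniform, _ = brute_force_opt([1, 1, 1, 1], [1, 1, 1, 1], budget=1)
# One dominant weight: only the heavy individual fits within the budget.
opt_skewed, x_skewed = brute_force_opt([5, -1, 1, 1], [1, 1, 1, 1], budget=2)
```

The uniform example shows the externality at work: each additional selected individual both adds a payment and raises the $\epsilon_i$ (and hence the cost) of everyone already selected.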

4.2 A Truthful DCLEF Mechanism 

To highlight the challenge behind designing a truthful DCLEF mechanism, observe
that if the privacy guarantees were given by $\epsilon_i(x) = x_i$ rather than (8), the
optimization problem (7) would be identical to the budget-constrained mechanism
design problem for knapsack studied by Singer [12]. In the reverse-auction
setting of [12], an auctioneer purchases items valued at fixed costs $v_i$ by the
individuals that sell them. Each item $i$ is worth $|w_i|$ to the auctioneer, while the
auctioneer's budget is $B$. The goal of the auctioneer is to maximize the total
worth of the purchased set of items, i.e., $S(x; w)$. Singer presents a truthful
mechanism that is 6-approximate with respect to OPT. However, in our setting,
the privacy guarantees $\epsilon_i(x)$ given by (8) introduce externalities into the
auction. In contrast to [12], the $\epsilon_i$'s couple the cost incurred by an individual $i$
to the weight of other individuals that are compensated by the auction, making
the mechanism design problem harder. This difficulty is overcome by our
mechanism, which we call FairInnerProduct, described in Algorithm 1.

The mechanism takes as input the budget $B$, the weight vector $w$, and the
vector of unit-costs $v$, and outputs a set $O \subseteq [n]$ of individuals that receive $x_i = 1$ in the
DCLEF, as well as a set of payments for each individual in $O$. Our construction
uses a greedy approach similar to the knapsack mechanism in [12]. In particular,
it identifies users that are the "cheapest" to purchase. To ensure truthfulness,
it compensates them within budget based on the unit-cost of the last individual
that was not included in the set of compensated users. As in greedy solutions to
knapsack, this construction does not necessarily yield a constant approximation
w.r.t. OPT; for that, the mechanism needs to sometimes compensate only the
user with the highest absolute weight $|w_i|$. In such cases, the payment of the
user of the highest weight is selected so that she has no incentive to lie about
her true unit cost.
Algorithm 1 FairInnerProduct(v, w, B)

  Let $k$ be the largest integer such that $\frac{B}{w([k])} \geq \frac{v_k}{W - w([k])}$.
  Let $i^* := \arg\max_{i \in [n]} |w_i|$.
  Let $p$ be as defined in (9).
  if $|w_{i^*}| > \sum_{i \in [k] \setminus \{i^*\}} |w_i|$ then
    Set $O = \{i^*\}$.
    Set $p_{i^*} = p$ and $p_i = 0$ for all $i \neq i^*$.
  else
    Set $O = [k]$.
    Pay each $i \in O$, $p_i = |w_i| \min\left\{ \frac{B}{w([k])}, \frac{v_{k+1}}{W - w([k])} \right\}$, and for $i \notin O$, $p_i = 0$.
  end if
  Set $x_i = 1$ if $i \in O$ and $x_i = 0$ otherwise.



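Recall that $v_1 \leq \ldots \leq v_n$. The mechanism defines $i^* := \arg\max_{i \in [n]} |w_i|$
as the individual with the largest $|w_i|$, and $k$ as the largest integer such that
$\frac{B}{w([k])} \geq \frac{v_k}{W - w([k])}$. Subsequently, the mechanism either sets $x_i = 1$ for the first
$k$ individuals, or, if $|w_{i^*}| > \sum_{i \in [k] \setminus \{i^*\}} |w_i|$, sets only $x_{i^*} = 1$. In the former case,
individuals $i \in [k]$ are compensated in proportion to their absolute weights $|w_i|$.
If, on the other hand, only $x_{i^*} = 1$, the individual $i^*$ receives a payment $p$ defined
as follows: Let

    $S_{-i^*} := \left\{ t \in [n] \setminus \{i^*\} : \frac{B}{\sum_{i \in [t] \setminus \{i^*\}} |w_i|} \geq \frac{v_t}{W - \sum_{i \in [t] \setminus \{i^*\}} |w_i|} \text{ and } \sum_{i \in [t] \setminus \{i^*\}} |w_i| \geq |w_{i^*}| \right\}$.

If $S_{-i^*} \neq \emptyset$, then let $r := \min\{i : i \in S_{-i^*}\}$. Define

    $p := B$ if $S_{-i^*} = \emptyset$, and $p := \frac{|w_{i^*}| v_r}{W - |w_{i^*}|}$ otherwise.    (9)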
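
The mechanism can be sketched in code as follows. This is an illustrative reading of Algorithm 1 and of the payment rule (9), not a reference implementation: it assumes the inputs are already sorted so that the unit-costs are non-decreasing, that all weights are non-zero, and that Assumption 5 holds. The budget conditions are checked in cross-multiplied form to avoid dividing by a zero residual weight, ties in the argmax are broken towards the smallest index, and the function names are hypothetical.

```python
def threshold_payment(v, a, budget, i_star):
    """Payment to i* when she is the only compensated individual (our reading of (9))."""
    total = sum(a)
    mass = 0.0  # sum of |w_i| over i in [t] \ {i*}
    for t in range(len(a)):
        if t == i_star:
            continue
        mass += a[t]
        fits_budget = budget * (total - mass) >= v[t] * mass
        if fits_budget and mass >= a[i_star]:
            # t is the smallest index r in S_{-i*}.
            return a[i_star] * v[t] / (total - a[i_star])
    return budget  # S_{-i*} is empty

def fair_inner_product(v, w, budget):
    """Return (x, p) for unit-costs v (sorted ascending), weights w, and a budget."""
    n = len(w)
    a = [abs(wi) for wi in w]
    total = sum(a)

    # k: largest prefix with B / w([k]) >= v_k / (W - w([k])).
    k, prefix = 0, 0.0
    for t in range(1, n):  # t = n would leave no residual weight
        prefix += a[t - 1]
        if budget * (total - prefix) >= v[t - 1] * prefix:
            k = t

    i_star = max(range(n), key=lambda i: a[i])
    others = sum(a[i] for i in range(k) if i != i_star)

    x, p = [0] * n, [0.0] * n
    if a[i_star] > others:
        x[i_star] = 1
        p[i_star] = threshold_payment(v, a, budget, i_star)
    else:
        w_k = sum(a[:k])
        # v[k] is v_{k+1} in the paper's 1-based indexing.
        price = min(budget / w_k, v[k] / (total - w_k))
        for i in range(k):
            x[i], p[i] = 1, a[i] * price
    return x, p

x_uniform, p_uniform = fair_inner_product([1, 2, 3, 10], [1, 1, 1, 1], 3)
x_heavy, p_heavy = fair_inner_product([1, 1, 1, 1], [5, -1, 1, 1], 2)
x_thresh, p_thresh = fair_inner_product([1, 2, 3], [3, 2, 2], 10)
```

In the uniform-weight example the first two individuals are selected and each is paid 1.5, which exhausts the budget of 3 and covers their costs $v_i \epsilon_i = v_i/2$. In the second example only the heavy individual is selected and is paid the full budget, because no prefix of the other individuals reaches her weight.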

The next theorem states that FairInnerProduct has the properties we desire.

Theorem 3 FairInnerProduct is truthful, individually rational and budget feasible.
It is 5-approximate with respect to OPT. Further, it is 2-approximate when
all weights are equal.

The theorem is proved in Appendix D. We note that the truthfulness of the knap- 
sack mechanism in [12] is established via Myerson's characterization of truthful 
single-parameter auctions (i.e., by showing that the allocation is monotone and 
the payments are threshold). In contrast, because of the coupling of costs induced 
by the Laplace noise in DCLEFs, we are unable to use Myerson's characterization 
and, instead, give a direct argument about truthfulness. 

We prove a 5-approximation by using the optimal solution of the fractional 
relaxation of (7). This technique can also be used to show that the knapsack 
mechanism in [12] is 5-approximate instead of 6-approximate. FairInnerProduct
generalizes the Ghosh-Roth mechanism; in the special case when all weights are
equal, FairInnerProduct reduces to the Ghosh-Roth mechanism, which, by Theorem
3, is 2-approximate with respect to OPT. In fact, our next theorem, proved
in Appendix E, states that the approximation ratio of a truthful mechanism is 
lower-bounded by 2. 

Theorem 4 (Hardness of Approximation) For all $\epsilon > 0$, there is no truthful,
individually rational, budget feasible DCLEF mechanism that is also
$(2 - \epsilon)$-approximate with respect to OPT.

Our benchmark OPT is stricter than that used in [7]. In particular, Ghosh and 
Roth show that their mechanism is optimal among all truthful, individually 
rational, budget-feasible, and envy-free mechanisms. In fact, the example we use 
to show hardness of approximation is a uniform weight example, implying that 
the lower-bound also holds for uniform weight case. Indeed, the mechanism in [7] 
is 2-approximate with respect to OPT, although it is optimal among individually 
rational, budget feasible mechanisms that are also truthful and envy free. 

5 Discussion on Linear Predictors 

As discussed in the introduction, a statistic s(d) of the form (1) can be viewed as 
a linear predictor and is thus of particular interest in the context of recommender
systems. We elaborate on this interpretation in this section. Assume that each
individual $i \in [n] = \{1, \ldots, n\}$ is endowed with a public vector $y_i \in \mathbb{R}^m$, which
includes m publicly known features about this individual. These could be, for 
example, demographic information such as age, gender or zip code, that the 
individual discloses in a public online profile. Note that, though features are 
public, the data di is perceived as private. 

Let $Y = [y_i]_{i \in [n]} \in \mathbb{R}^{n \times m}$ be a matrix comprising the public feature vectors. 
Consider a new individual, not belonging to the database, whose public feature 
profile is $y \in \mathbb{R}^m$. Having access to $Y$, $d$, and $y$, the data analyst wishes to 
release a prediction for the unknown value $d$ for this new individual. Below, we 
give several examples where this prediction takes the form $s(d) = \langle w, d \rangle$, for 
some $w = w(y, Y)$. All examples are textbook inference examples; we refer the 
interested reader to, for example, [22] for details. 

k-Nearest Neighbors. In k-Nearest Neighbors prediction, the feature space $\mathbb{R}^m$ 
is endowed with a distance metric (e.g., the $\ell_2$ norm), and the predicted 
value is given by an average among the $k$ nearest neighbors of the feature vector 
$y$ of the new individual. I.e., $s(d) = \frac{1}{k} \sum_{i \in N_k(y)} d_i$, where $N_k(y) \subseteq [n]$ comprises 
the $k$ individuals whose feature vectors $y_i$ are closest to $y$. 
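
A minimal sketch, in Python, of how the k-Nearest Neighbors prediction fits the linear form: the weight vector puts 1/k on the k closest feature vectors and 0 elsewhere. The function names `knn_weights` and `predict` and the toy data are illustrative assumptions, not part of the paper.

```python
import math

def knn_weights(y, Y, k):
    """Return w with w_i = 1/k for the k rows of Y closest to y (l2 norm), else 0."""
    dist = [math.dist(y, yi) for yi in Y]
    # Indices of the k nearest feature vectors; ties are broken by index order.
    nearest = sorted(range(len(Y)), key=lambda i: (dist[i], i))[:k]
    w = [0.0] * len(Y)
    for i in nearest:
        w[i] = 1.0 / k
    return w

def predict(w, d):
    """Linear predictor s(d) = <w, d>."""
    return sum(wi * di for wi, di in zip(w, d))

# Hypothetical toy data: three individuals with one public feature each.
Y = [[0.0], [1.0], [5.0]]
d = [2.0, 4.0, 10.0]
w = knn_weights([0.4], Y, k=2)
s = predict(w, d)
```

The weights depend only on the public quantities y and Y, never on the private data d, which is what makes the statistic a linear predictor with publicly known weights.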

Nadaraya-Watson Weighted Average. The Nadaraya-Watson weighted average 
leverages all data in the database, weighing more highly data closer to $y$. 
The general form of the prediction is $s(d) = \sum_{i=1}^n K(y, y_i) d_i / \sum_{i=1}^n K(y, y_i)$, 
where the kernel $K : \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}_+$ is a function decreasing in the distance 
between its arguments (e.g., $K(y, y') = e^{-\|y - y'\|_2^2}$). 
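
The kernel-weighted average can be sketched in the same way. The snippet below is an illustration under stated assumptions: it uses the Gaussian-type kernel exp(-||y - y'||^2) as the example kernel, and the names `gaussian_kernel` and `nw_weights` as well as the toy data are hypothetical.

```python
import math

def gaussian_kernel(y, yp):
    """Example kernel K(y, y') = exp(-||y - y'||_2^2), decreasing in the distance."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(y, yp)))

def nw_weights(y, Y, kernel=gaussian_kernel):
    """Weights w_i = K(y, y_i) / sum_j K(y, y_j); non-negative and summing to 1."""
    k = [kernel(y, yi) for yi in Y]
    total = sum(k)  # assumed positive, i.e. y is not "infinitely far" from all y_i
    return [ki / total for ki in k]

# Hypothetical toy data: three individuals with one public feature each.
Y = [[0.0], [1.0], [5.0]]
d = [2.0, 4.0, 10.0]
w = nw_weights([0.5], Y)
s = sum(wi * di for wi, di in zip(w, d))
```

Unlike k-Nearest Neighbors, every individual receives a strictly positive weight here, but distant individuals contribute very little to the prediction.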

Ridge Regression. In ridge regression, the analyst first fits a linear model to 
the data, i.e., solves the optimization problem 

$$\min_{b \in \mathbb{R}^m} \sum_{i=1}^n (d_i - \langle y_i, b \rangle)^2 + \lambda \|b\|_2^2, \qquad (10)$$

where $\lambda \ge 0$ is a regularization parameter, enforcing that the vector $b$ takes small 
values. The prediction is then given by the inner product $\langle y, b \rangle$. The solution to 
(10) is given by $b = (Y^T Y + \lambda I)^{-1} Y^T d$; as such, the predicted value for a new 
user with feature vector $y$ is given by $s(d) = \langle y, b \rangle = y^T (Y^T Y + \lambda I)^{-1} Y^T d$. 
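
The closed form follows from setting the gradient of the ridge objective (10) to zero. A short derivation, written in LaTeX and assuming that the regularized Gram matrix is invertible (which holds, for instance, when the regularization parameter is strictly positive; this is an assumption made for the sketch), also makes the weight vector explicit:

```latex
\nabla_b \Big( \|d - Yb\|_2^2 + \lambda \|b\|_2^2 \Big)
  = -2\, Y^\top (d - Yb) + 2\lambda b = 0
\;\Longrightarrow\;
(Y^\top Y + \lambda I)\, b = Y^\top d ,
\qquad
s(d) = \langle y, b \rangle = \langle w, d \rangle
\quad\text{with}\quad
w = Y (Y^\top Y + \lambda I)^{-1} y .
```

The last identity uses the symmetry of the regularized Gram matrix. The entries of this weight vector can be negative, in contrast to the two averaging examples.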

Support Vector Machines. A more general regression model assumes that the 
private values $d_i$ can be expressed in terms of the public vectors $y_i$ as a linear 
combination of a set of basis functions $h_\ell : \mathbb{R}^m \to \mathbb{R}$, $\ell = 1, \ldots, L$, i.e., the 
analyst first solves the optimization problem 

$$\min_{b \in \mathbb{R}^L} \sum_{i=1}^n \Big( d_i - \sum_{\ell=1}^L b_\ell h_\ell(y_i) \Big)^2 + \lambda \|b\|_2^2. \qquad (11)$$

For $y, y' \in \mathbb{R}^m$, denote by $K(y, y') = \sum_{\ell=1}^L h_\ell(y) h_\ell(y')$ the kernel of the space 
spanned by the basis functions. Let $K(Y) = [K(y_i, y_j)]_{i,j \in [n]} \in \mathbb{R}^{n \times n}$ be the 
$n \times n$ matrix comprising the kernel values evaluated at each pair of feature vectors 
in the database, and $k(y, Y) = [K(y, y_i)]_{i \in [n]} \in \mathbb{R}^n$ the kernel values w.r.t. the 
new user. The solution to (11) yields a predicted value for the new individual of 
the form: $s(d) = (k(y, Y))^T (K(Y) + \lambda I)^{-1} d$. 

In all four examples, the prediction $s(d)$ is indeed of the form (1). Note that 
the weights are non-negative in the first two examples, but may assume negative 
values in the latter two. 

6 Conclusion and Future Work 

We considered the setting of an auction, where a data analyst wishes to buy, 
from a set of $n$ individuals, the right to use their private data $d_i \in \mathbb{R}$, $i \in [n]$, 
in order to cheaply obtain an accurate estimate of a statistic. Motivated by 
recommender systems and, more generally, prediction problems, the statistic we 
consider is a linear predictor with publicly known weights. The statistic can be 
viewed as a prediction of the unknown data of a new individual based on the 
database entries. We formalized the trade-off between privacy and accuracy in 
this setting; we showed that obtaining an accurate estimate necessitates giving 
poor differential privacy guarantees to individuals whose cumulative weight is 
large. We showed that DCLEF estimators achieve an order-optimal trade-off 
between privacy and accuracy, and, consequently, it suffices to focus on DCLEF 
mechanisms. We use this observation to design a truthful, individually rational, 
budget feasible mechanism under the constraint that the analyst has a fixed 
budget. Our mechanism can be viewed as a proportional-purchase mechanism, 
i.e., the privacy guaranteed by the mechanism to individual $i$ is proportional 
to her weight $|w_i|$. We show that our mechanism is 5-approximate in terms of 
accuracy compared to an optimal (possibly non-truthful) mechanism, and that 
no truthful mechanism can achieve a $(2 - \varepsilon)$-approximation, for any $\varepsilon > 0$. 

Our work is the first studying privacy auctions for asymmetric statistics, and 
can be extended in a number of directions. An interesting direction to investigate 
is characterizing the most general class of statistics for which truthful privacy 
auctions that achieve order-optimal accuracy can be designed. An orthogonal 
direction is to study the release of asymmetric statistics in other settings such 
as (a) using a different notion of privacy, (b) allowing costs to be correlated 
with the data values, and (c) survey-type settings where individuals first decide 
whether to participate and then reveal their private data. 

A Proof of Theorem 1 (Trade-off between Privacy-index and Distortion) 

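By Definition 3, the privacy index $\beta(\hat{s})$ for an estimator $\hat{s}$ is the optimal 
objective value of the following optimization problem: maximize $\sum_{i=1}^n |w_i| x_i$ 
subject to $\sum_{i=1}^n \epsilon_i x_i \le 1/2$ and $x_i \in \{0, 1\}$ for all $i \in [n]$. 

Interpreting $|w_i|$ as the value, and $\epsilon_i$ as the size of object $i$, the above 
problem can be viewed as a 0/1 knapsack problem where the size of the 
knapsack is 1/2. Assume for this proof, without loss of generality, that 
$\epsilon_1/|w_1| \le \cdots \le \epsilon_n/|w_n|$. We define some notation that is needed in the proof. Let 

$$h(\hat{s}) := \max\Big\{ j \in [n] : \frac{\epsilon_j}{|w_j|} \le \frac{1}{2 w([j])} \Big\} \quad \text{if } \frac{\epsilon_1}{|w_1|} \le \frac{1}{2|w_1|},$$

and $h(\hat{s}) := 0$ otherwise. Observe that $0 \le h(\hat{s}) \le n$. Next, define 

$$i^* := \arg\max_{i \in [n] : \epsilon_i \le 1/2} |w_i|, \quad \text{and} \quad H(\hat{s}) := \begin{cases} [h(\hat{s})], & \text{if } w([h(\hat{s})]) > |w_{i^*}|, \\ \{i^*\}, & \text{otherwise.} \end{cases}$$
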

The following then holds. 
Lemma 3 $2w(H(\hat{s})) \ge \beta(\hat{s})$. 

Proof. H(s) is a 2-approximate greedy solution to the 0/1 knapsack problem 
given by [2:!, Section 2.4]. 
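
To illustrate the kind of guarantee Lemma 3 relies on, here is a minimal Python sketch of the textbook greedy 2-approximation for 0/1 knapsack: take the better of the densest feasible prefix and the single most valuable item that fits. It is a standard variant shown for illustration and is not claimed to coincide line by line with the set defined in the proof; the function names and the toy instance are assumptions, and all values are assumed strictly positive.

```python
from itertools import combinations

def greedy_knapsack(values, sizes, capacity=0.5):
    """Return indices chosen by the textbook greedy 2-approximation."""
    # Items that do not fit on their own can never be used.
    fits = [i for i in range(len(values)) if sizes[i] <= capacity]
    if not fits:
        return []
    # Increasing size/value ratio, i.e. decreasing value density.
    order = sorted(fits, key=lambda i: sizes[i] / values[i])
    prefix, used = [], 0.0
    for i in order:
        if used + sizes[i] > capacity:
            break
        prefix.append(i)
        used += sizes[i]
    best_single = max(fits, key=lambda i: values[i])
    prefix_value = sum(values[i] for i in prefix)
    return prefix if prefix_value > values[best_single] else [best_single]

def brute_force(values, sizes, capacity=0.5):
    """Exact optimum by enumeration (only sensible for tiny instances)."""
    n = len(values)
    return max(
        sum(values[i] for i in S)
        for r in range(n + 1)
        for S in combinations(range(n), r)
        if sum(sizes[i] for i in S) <= capacity
    )

# Hypothetical instance: values play the role of |w_i|, sizes of epsilon_i.
values = [8.0, 5.0, 4.0]
sizes = [0.375, 0.25, 0.25]
chosen = greedy_knapsack(values, sizes)
greedy_value = sum(values[i] for i in chosen)
optimum = brute_force(values, sizes)
```

On this instance the greedy choice is suboptimal (it picks the first item, while the optimum picks the other two), but its value is still at least half of the optimum, as the 2-approximation guarantee requires.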

Now we are ready to prove that if the distortion $\delta(s, \hat{s})$ is small, then $w(H(\hat{s}))$ 
is also small, which, together with Lemma 3, proves the theorem. In our proof, we 
make use of the notion of $k$-accuracy defined in [7, Definition 2.6]. For $\hat{s} : I^n \to \mathbb{R}$, 
let 

$$k_{\hat{s}} := \min\{k \in \mathbb{R}_+ : \forall d \in I^n, \; P[|\hat{s}(d) - s(d)| \ge k] \le 1/3\}. \qquad (12)$$

Lemma 4 Let $0 < \alpha < 1$. If $w(H(\hat{s})) \ge \alpha W$ then $k_{\hat{s}} \ge \alpha W \Delta / 4$. 

Proof. Assume for the sake of contradiction that $w(H(\hat{s})) \ge \alpha W$ and 
$k_{\hat{s}} < \alpha W \Delta / 4$. For a data vector $d$, let $z = s(d) = \sum_{i=1}^n w_i d_i$ and $\hat{z} = \hat{s}(d)$. Also, let 
$S := \{y \in \mathbb{R} : |y - z| < k_{\hat{s}}\}$. Then, by (12), $P[\hat{z} \in S] \ge 2/3$. 
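
The set $H(\hat{s})$ can be partitioned as follows: $H(\hat{s}) = H^+(\hat{s}) \cup H^-(\hat{s})$, with 
$H^+(\hat{s}) \cap H^-(\hat{s}) = \emptyset$, where the disjoint subsets $H^+(\hat{s})$ and $H^-(\hat{s})$ are defined by 

$$H^+(\hat{s}) = \{i \in [n] : d_i < \bar{R} \text{ and } w_i < 0\} \cup \{i \in [n] : d_i \ge \bar{R} \text{ and } w_i \ge 0\},$$
$$H^-(\hat{s}) = \{i \in [n] : d_i < \bar{R} \text{ and } w_i \ge 0\} \cup \{i \in [n] : d_i \ge \bar{R} \text{ and } w_i < 0\}. \qquad (13)$$

Then $w(H(\hat{s})) = w(H^+(\hat{s})) + w(H^-(\hat{s}))$. Thus, one of the subsets $H^+(\hat{s})$ and 
$H^-(\hat{s})$ must have a total weight greater than or equal to $w(H(\hat{s}))/2$. Without loss of 
generality, assume that $w(H^+(\hat{s})) \ge w(H(\hat{s}))/2$. 

Consider another data vector $d'$ where $d'_i = d_i$ if $i \in [n] \setminus H^+(\hat{s})$, while if 
$i \in H^+(\hat{s})$, 

$$d'_i = \begin{cases} d_i + \Delta/2, & \text{if } d_i < \bar{R} \text{ and } w_i < 0, \\ d_i - \Delta/2, & \text{if } d_i \ge \bar{R} \text{ and } w_i \ge 0. \end{cases} \qquad (14)$$

Let $z' := s(d') = \sum_{i=1}^n w_i d'_i$ and let $\hat{z}' := \hat{s}(d')$. Also, let $S' := \{y \in \mathbb{R} : |y - z'| < k_{\hat{s}}\}$. 
From eq. (14), we have 

$$|z - z'| = \Big| \sum_{i \in H^+(\hat{s})} w_i (d_i - d'_i) \Big| = \Big| \sum_{i \in H^+(\hat{s})} |w_i| \Delta/2 \Big| = \frac{\Delta}{2} w(H^+(\hat{s})) \ge \frac{\Delta}{4} w(H(\hat{s})) \ge \alpha \frac{\Delta}{4} W. \qquad (15)$$

Since $k_{\hat{s}} < \alpha W \Delta / 4$, eq. (15) implies that $S$ and $S'$ are disjoint. 

Since $\hat{s}$ is $(\epsilon_1, \ldots, \epsilon_n)$-differentially private, and $d$ and $d'$ differ in exactly the 
entries in $H^+(\hat{s})$, $P[\hat{z}' \in S] \ge \exp\big(-\sum_{i \in H^+(\hat{s})} \epsilon_i\big) P[\hat{z} \in S] \ge \exp\big(-\sum_{i \in H^+(\hat{s})} \epsilon_i\big) \cdot \frac{2}{3}$. 
Note that $\sum_{i \in [h(\hat{s})]} \epsilon_i \le \sum_{i \in [h(\hat{s})]} \frac{|w_i|}{2 w([h(\hat{s})])} = \frac{1}{2}$, and also $\epsilon_{i^*} \le 1/2$. Therefore, 
$\sum_{i \in H(\hat{s})} \epsilon_i \le 1/2$. Since $H^+(\hat{s}) \subseteq H(\hat{s})$, we have $\sum_{i \in H^+(\hat{s})} \epsilon_i \le \sum_{i \in H(\hat{s})} \epsilon_i \le 1/2$. 

This implies $P[\hat{z}' \in S] \ge \exp\big(-\sum_{i \in H^+(\hat{s})} \epsilon_i\big) \cdot \frac{2}{3} \ge \exp\big(-\frac{1}{2}\big) \cdot \frac{2}{3} = \frac{2}{3\sqrt{e}} > \frac{1}{3}$. 
Given that $S$ and $S'$ are disjoint, $P[\hat{z}' \in S] > 1/3$ implies that $P[\hat{z}' \notin S'] > 1/3$, 
which contradicts the assumption that $k_{\hat{s}} < \alpha W \Delta / 4$. 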

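Next we relate $k_{\hat{s}}$-accuracy to the distortion $\delta(s, \hat{s})$: 

Lemma 5 For $s(d)$ as defined in (1) and a function $\hat{s} : I^n \to \mathbb{R}$, $k_{\hat{s}} \le \sqrt{3\delta(s, \hat{s})}$. 

Proof. Observe that for all $k \ge \sqrt{3\delta(s, \hat{s})}$, 
$P[|\hat{s}(d) - s(d)| \ge k] \le P[|\hat{s}(d) - s(d)| \ge \sqrt{3\delta(s, \hat{s})}] \le \frac{E[|\hat{s}(d) - s(d)|^2]}{3\delta(s, \hat{s})} \le \frac{1}{3}$, 
where the second step follows from Markov's inequality. This implies $k_{\hat{s}} \le \sqrt{3\delta(s, \hat{s})}$. 

Corollary 2 If $w(H(\hat{s})) \ge \alpha W$ then $\delta(s, \hat{s}) \ge (\alpha W \Delta)^2 / 48$. 

Proof. The corollary follows from Lemma 4 and Lemma 5. 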
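
For completeness, the one-line calculation behind Corollary 2 can be written out as follows; it only chains the bounds of Lemma 4 and Lemma 5 and introduces no additional assumptions.

```latex
\frac{\alpha W \Delta}{4} \;\le\; k_{\hat{s}} \;\le\; \sqrt{3\,\delta(s,\hat{s})}
\;\Longrightarrow\;
\delta(s,\hat{s}) \;\ge\; \frac{1}{3}\Big(\frac{\alpha W \Delta}{4}\Big)^{2}
  \;=\; \frac{(\alpha W \Delta)^{2}}{48}.
```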

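Thus from Corollary 2, we have that if $\delta(s, \hat{s}) < (\alpha W \Delta)^2 / 48$, then $w(H(\hat{s})) < \alpha W$. 
Since $w(H(\hat{s})) \ge \frac{1}{2}\beta(\hat{s})$ (from Lemma 3), it follows that if $\delta(s, \hat{s}) < (\alpha W \Delta)^2 / 48$, 
then $\frac{1}{2}\beta(\hat{s}) < \alpha W$. This concludes the proof of Theorem 1. □ 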

B Proof of Lemma 2 

For the first part of this lemma, observe that the sensitivity of $w_i[x_i d_i + (1 - x_i) a_i]$ 
w.r.t. $i$ is $S_i(\hat{s}) = \Delta |w_i| x_i$. The differential privacy guarantee therefore 
follows from Lemma 1. 

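To obtain the lower bound on the distortion, observe that substituting the 
expressions for $s$ and $\hat{s}$ in the expression for $\delta(s, \hat{s})$, we get 

$$\delta(s, \hat{s}) = \max_{d \in I^n} E\big[|s(d) - \hat{s}(d; a, x, \sigma)|^2\big]$$
$$= \max_{d \in I^n} E\Big[\Big(\sum_{i=1}^n w_i d_i (1 - x_i) - \sum_{i=1}^n w_i a_i (1 - x_i) - z\Big)^2\Big] \quad (\text{where } z \sim \mathrm{Lap}(\sigma))$$
$$= \max_{d \in I^n} \Big(\sum_{i=1}^n w_i (1 - x_i)(d_i - a_i)\Big)^2 + 2\sigma^2 \quad (\text{since } E[z] = 0, \; E[z^2] = 2\sigma^2)$$
$$= 2\sigma^2 + \max_{d \in I^n} \Big(\sum_{i=1}^n \gamma_i (d_i - a_i)\Big)^2 = 2\sigma^2 + \Big(\max_{d \in I^n} \Big|\sum_{i=1}^n \gamma_i (d_i - a_i)\Big|\Big)^2,$$

where $\gamma_i := w_i (1 - x_i)$. Observe that $\max_{d \in I^n} |f(d)| = \max\{|\max_{d \in I^n} f(d)|, |\min_{d \in I^n} f(d)|\}$ for any 
continuous function $f : I^n \to \mathbb{R}$. Therefore, 

$$\delta(s, \hat{s}) = 2\sigma^2 + \Big(\max\Big\{\Big|\max_{d \in I^n} \sum_{i=1}^n \gamma_i (d_i - a_i)\Big|, \Big|\min_{d \in I^n} \sum_{i=1}^n \gamma_i (d_i - a_i)\Big|\Big\}\Big)^2$$
$$= 2\sigma^2 + \Big(\max\Big\{\Big|\gamma^{(+)} R_{\max} + \gamma^{(-)} R_{\min} - \sum_{i=1}^n \gamma_i a_i\Big|, \Big|\gamma^{(+)} R_{\min} + \gamma^{(-)} R_{\max} - \sum_{i=1}^n \gamma_i a_i\Big|\Big\}\Big)^2,$$

where $\gamma^{(+)} := \sum_{i : \gamma_i \ge 0} \gamma_i$ and $\gamma^{(-)} := \sum_{i : \gamma_i < 0} \gamma_i$. Observe that, for any $a, b, c \in \mathbb{R}$, 
it is true that $\max(|a - c|, |b - c|) \ge |a - b|/2$, with equality attained at $c = (a + b)/2$. 
Applying this for $a = \gamma^{(+)} R_{\max} + \gamma^{(-)} R_{\min}$, $b = \gamma^{(+)} R_{\min} + \gamma^{(-)} R_{\max}$ and 
$c = \sum_{i=1}^n \gamma_i a_i$, we get 

$$\min_{a \in \mathbb{R}^n} \delta(s, \hat{s}) \ge 2\sigma^2 + \Big(\frac{(\gamma^{(+)} - \gamma^{(-)})(R_{\max} - R_{\min})}{2}\Big)^2 = 2\sigma^2 + \Big(\frac{\Delta}{2} \sum_{i=1}^n |w_i| (1 - x_i)\Big)^2,$$

with equality attained when $\sum_i \gamma_i a_i = (\gamma^{(+)} + \gamma^{(-)})(R_{\max} + R_{\min})/2 = \sum_i \gamma_i \bar{R}$, 
which holds for $a_i = \bar{R}$. 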
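
The closed form can be checked numerically on a small instance. The sketch below is an illustration, not part of the proof: it evaluates the worst-case distortion of an estimator of this form by enumerating the corners of the data box (the bias term is convex in d, so its maximum is attained at a corner) and compares the result with the closed-form expression. All function names and the numerical values are hypothetical.

```python
from itertools import product

def worst_case_distortion(w, x, a, sigma, r_min, r_max):
    """max over d in [r_min, r_max]^n of bias(d)^2 + 2*sigma^2, where
    bias(d) = sum_i w_i (1 - x_i) (d_i - a_i) and 2*sigma^2 = E[z^2] for z ~ Lap(sigma)."""
    gamma = [wi * (1 - xi) for wi, xi in zip(w, x)]
    worst_bias = max(
        abs(sum(g * (di - ai) for g, di, ai in zip(gamma, d, a)))
        for d in product((r_min, r_max), repeat=len(w))
    )
    return 2 * sigma ** 2 + worst_bias ** 2

def closed_form(w, x, sigma, r_min, r_max):
    """2*sigma^2 + (Delta/2 * sum_i |w_i| (1 - x_i))^2 with Delta = r_max - r_min."""
    delta = r_max - r_min
    return 2 * sigma ** 2 + (delta / 2 * sum(abs(wi) * (1 - xi) for wi, xi in zip(w, x))) ** 2

# Hypothetical instance: the first individual's data is used (x_1 = 1), the others are not.
w, x = [1.0, -2.0, 3.0], [1, 0, 0]
r_min, r_max, sigma = 0.0, 4.0, 1.0
midpoint = [(r_min + r_max) / 2] * len(w)
at_midpoint = worst_case_distortion(w, x, midpoint, sigma, r_min, r_max)
off_center = worst_case_distortion(w, x, [1.0, 1.0, 1.0], sigma, r_min, r_max)
bound = closed_form(w, x, sigma, r_min, r_max)
```

With the substitute values set to the midpoint of the range, the enumerated worst case matches the closed form; moving them off-center can only increase the worst-case distortion.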



C Proof of Theorem 2 (DCLEFs Suffice) 

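Consider the function $s^\circ(d) := \sum_{i \notin H^\circ} w_i d_i + \bar{R} \sum_{i \in H^\circ} w_i + \mathrm{Lap}(w(H^\circ))$, where 
$H^\circ$ is defined as $H^\circ := \arg\max\{w(H) : H \subseteq [n] \text{ and } w(H) \le \alpha W\}$. We can 
write $s^\circ$ as $s^\circ(d; x) := \sum_{i=1}^n w_i d_i x_i + \bar{R} \sum_{i=1}^n w_i (1 - x_i) + \mathrm{Lap}(w(H^\circ))$, where 
$x_i = 0$ for all $i \in H^\circ$ and $x_i = 1$ otherwise. Observe that $s^\circ$ is a DCLEF and 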

$\delta(s, s^*) \le (\alpha W \Delta)^2 / 48$, it follows from Lemma 5 that $k_{s^*} \le \alpha W \Delta / 4$. Then, it 
follows from Lemma 4 that $w(H(s^*)) \le \alpha W$, where $H(s^*)$ is as defined in the 
proof of Theorem 1. Further, it follows that $w(H^\circ) \ge w(H(s^*)) \ge \frac{1}{2}\beta(s^*)$ (the 
first inequality follows by definition of $H^\circ$ and the fact that $w(H(s^*)) \le \alpha W$, and 
the second from Lemma 3). Since $\beta(s^\circ) \ge w(H^\circ)$, it follows that $\beta(s^\circ) \ge \frac{1}{2}\beta(s^*)$. 

D Proof of Theorem 3 

D.1 Truthfulness, Individual Rationality, and Budget Feasibility 

In this section, we prove that FairInnerProduct is truthful, individually rational, 
and budget feasible. We first define 

5i:=|te[n]\{**}:^^ ^ , , > ' 



and 

$$S_2 := \Big\{ t \in [n] \setminus \{i^*\} : \sum_{i \in [t] \setminus \{i^*\}} |w_i| \ge |w_{i^*}| \Big\}.$$

Observe that $S_{-i^*} = S_1 \cap S_2$. 

Proposition 1 FairInnerProduct is budget feasible. 

Proof. When O = {i*} and p = B, the mechanism is trivially budget feasible. If 
P = J^"^!*!^"^ I then observe that since r £ S^i*, this implies r £ Si and r £ S2- 
Therefore, p = Jr' i J."'' 1 < w .^l™'*!"'' — , — r < — l^'i'l^ — ^ ^ where the 

second inequality holds because $r \in S_1$ and the last inequality because $r \in S_2$. 
When $O = [k]$, the sum of the payments made by the mechanism is given by 

$$\sum_{i \le k} p_i \le \sum_{i \le k} |w_i| \frac{B}{w([k])} = \frac{B}{w([k])} \sum_{i \le k} |w_i| = B.$$

Proposition 2 If $i^* > k + 1$ and $|w_{i^*}| > \sum_{i \in [k] \setminus \{i^*\}} |w_i|$, then $S_{-i^*} = \emptyset$. 

Proof. Observe that if $i^* > k + 1$ and $|w_{i^*}| > \sum_{i \in [k] \setminus \{i^*\}} |w_i|$, then $S_1 = [k]$ and 
$S_2 \cap [k] = \emptyset$. 

Proposition 3 If $|w_{i^*}| > \sum_{i \in [k] \setminus \{i^*\}} |w_i|$ and $S_{-i^*} \ne \emptyset$, then $r > i^*$. 

Proof. From Proposition 2, $S_{-i^*} \ne \emptyset$ implies either $i^* \le k + 1$ or 
$|w_{i^*}| \le \sum_{i \in [k] \setminus \{i^*\}} |w_i|$. Since the latter is false, it must be that $i^* \le k + 1$. In that 
case, $S_2 \cap [k] = \emptyset$. Therefore $r > k$. If $i^* = k + 1$, then for all $j \in S_2$, $j \ge k + 2$. 
Therefore $r \ge k + 2$. 

Proposition 4 FairInnerProduct is individually rational. 
Proof. We divide the proof into two cases: 

Case I: $O = [k]$. We know that $B/w([k]) \ge v_k/(W - w([k]))$ (by construction) 
and $v_{k+1} \ge v_k$ (by definition). Therefore, for all $i \le k$, 
$p_i \ge \frac{|w_i| v_k}{W - w([k])} \ge \frac{|w_i| v_i}{W - w([k])} = c_i(\epsilon_i)$. 

Case II: $O = \{i^*\}$. If $p_{i^*} = B$, then the mechanism is individually rational by 
Assumption 5. If $p_{i^*} = \frac{|w_{i^*}| v_r}{W - |w_{i^*}|}$, then, by Proposition 3, $v_r \ge v_{i^*}$ and therefore 
the mechanism is individually rational. □ 

Proposition 5 FairInnerProduct is dominant-strategy truthful. 

Proof. Fix any $v$ and assume that user $i$ reports a value $z \ne v_i$, while the 
remaining values $v_{-i}$ remain the same. Let $u$ be the resulting vector of values, 
i.e., $u_i = z$ and $u_j = v_j$, for $j \ne i$. The vector $u$ induces a new ordering of 
the users in terms of their reported values $u_i$, $i \in [n]$; let $\pi : [n] \to [n]$ be the 
permutation indicating the position of users under the new ordering. That is, 
$\pi$ is 1-1 and onto such that if $u_j < u_{j'}$ then $\pi(j) < \pi(j')$, for all $j, j' \in [n]$. 
For given $j \in [n]$, we denote the set of users preceding $j$ under this ordering by 
$P_j = \{j' : \pi(j') \le \pi(j)\}$. Note that all $j' \in P_j$ satisfy $u_{j'} \le u_j$. Observe that if 
$z > v_i$ then 

$$w(P_j) = \begin{cases} w([j]), & \text{for all } j < i \\ w([j]) - |w_i|, & \text{for all } j > i \text{ s.t. } \pi(j) < \pi(i) \\ w([i]) + w(\{\ell : \ell > i \wedge \pi(\ell) < \pi(i)\}), & \text{for } j = i \\ w([j]), & \text{for all } j > i \text{ s.t. } \pi(j) > \pi(i) \end{cases} \qquad (16)$$

while if $z < v_i$ then 

$$w(P_j) = \begin{cases} w([j]), & \text{for all } j < i \text{ s.t. } \pi(j) < \pi(i) \\ w([i]) - w(\{\ell : \ell < i \wedge \pi(\ell) > \pi(i)\}), & \text{for } j = i \\ w([j]) + |w_i|, & \text{for all } j < i \text{ s.t. } \pi(j) > \pi(i) \\ w([j]), & \text{for all } j > i \end{cases} \qquad (17)$$

Let - {i e [n] : ^ > w^^^} where 1^ = «;([n]). Then, by (16), if 

$z > v_i$ then $w(P_j) \le w([j])$ for $j \ne i$ while $w(P_i) \ge w([i])$. As a result, if $z > v_i$, 
then 

for $j \ne i$, if $j \in [k]$, then $j \in \mathcal{M}_\pi$ (18a) 
if $i \notin [k]$, then $i \notin \mathcal{M}_\pi$ (18b) 

Similarly, from (17), if $z < v_i$, then 

for $j \ne i$, if $j \notin [k]$, then $j \notin \mathcal{M}_\pi$ (19a) 
if $i \in [k]$, then $i \in \mathcal{M}_\pi$ (19b) 

Observe that, given the value vector u, the mechanism will output O-,^ ~ {**}; 
if > w{Mt, \ {i*}), and = otherwise. If = M^r, users j g Af^r are 

compensated by pj min | ^^^^^^ , | . If 0,^ = {«*}, the latter is 

compensated by p given by (9). We consider the following cases: 

Case I: $O_\pi = \mathcal{M}_\pi$. If $i \notin \mathcal{M}_\pi$, then $p_i = \epsilon_i = 0$, so since FairInnerProduct is 
individually rational, $i$ has no incentive to report $z$. Suppose thus that $i \in \mathcal{M}_\pi$. 
We consider the following subcases: 

Case 1(a): i <^ [fc]. Then Vi > Vk+i- Since i G Af^ but i ^ [k], (18) implies that 
z < V,. By (19) fc + 1 ^ M^. Thus p, < \w,\vk+i/w{A'U) < \wi\v^/w{M^). 
Case 1(b): i € [k]. We will first show that Af^ \ [k] = 0. Suppose, for the sake of 
contradiction, that \[k] 7^ 0. Then A/tt \ [k] must contain an element different 
than i; this, along with (19) implies that z > Uj. If 7r(i) < 7T{k + 1), then by (16) 
w{Pj) = w{[j]) and j ^ A/^r for all j > fc + 1, which contradicts that Af^ \ [k] is 
non-empty. Hence, 7r(i) > 7r(fc + l); this however implies that w{Pi) > w{[k + l]), 
by (16), and that z > Vk+i- Thus ^ < ^^^^^ < ^ w-w(P,) ^ so 

i ^ A^TT, a contradiction. Hence Af^ \ [fc] = 0. 

Next we will show that the original output O = [k]. Suppose, for the sake of 
contradiction, that O = {«*}. Then \wi» \ > w{[k] \ {i*}) while \wi* \ < w{M.,^ \ 
{i*}). Thus, Af^ \ [k] 7^ 0, a contradiction. Thus, O = [fc]. 

If = ^^TT = [k], then since O = [k], user i receives the same payoff, so 
it has no incentive to report z. Suppose that Af^ 7^ [k]. Since Af^r \ [k] = 0, it 
must be that [k] \ Mj^ ^ 0. By (18), this implies z < Vi. li i < k, (17) implies 
that k £ A'/tt and so do all j s.t. 7r(j) < 7r(fc). Thus, [k] ~ Mt^, a contradiction. 
If I = fc and z < Vi, then it is possible that j ^ Af^ for some j < k. Thus, 
Pi ^ il(M '') = !^{m''^) ^^'^ * incentive to report z. 

Case II: $O_\pi = \{i^*\}$. If $i \ne i^*$, then $i$'s payoff is obviously zero, so it has no 
incentive to report $z$. Suppose thus that $i = i^*$. We consider the following two 
subcases. 

Case II(a): $O = \{i^*\}$. Observe that $S_{-i^*}$ and $\hat{p}$ do not depend on $v_{i^*}$. Thus, 
since $O = \{i^*\}$, $i$ receives the same payment $\hat{p}$, so it has no incentive to misreport 
its value. 

Case 11(b) O = [k]. Then \w,,\ < w{[k] \ {i*}) while \w,, \ > w{M^ \ {i*}). 
Thus, [k] \ A/tt must contain an element different than i* . From (18), this implies 
that z < Vi. li i < k, (17) implies that k € Af^r and so do all j s.t. Tr{j) < ■n{k). 
Thus, [fc] = A/tt, a contradiction. 

Assume thus that i > k. Then Vi > Vk- Let j* = k ii i > k and j* = fc — 1 if 
i = k. Observe that j* G S-i*: indeed, it is in Si since i G [k], by the definition 
of k, and it is in S2 because \wi* \ < w{[k] \ {i*}). Hence p < ^^^^ < ^-"(m I - 
so z's payoff is at most zero, so it has no incentive to misreport its value. 

D.2 Approximation Ratio 

In this section we prove that FairInnerProduct is 5-approximate with respect to 
OPT. 



Optimal Continuous Canonical Laplace Mechanism We first characterize 
an individually rational, budget feasible, continuous canonical Laplace mecha- 
nism that has optimal distortion. Consider the fractional relaxation of (7). 

$$\text{maximize} \quad \sum_{i=1}^n |w_i| x_i \qquad (20a)$$
$$\text{subject to} \quad p_i \ge c_i(\epsilon_i) = v_i \epsilon_i(x), \quad \forall i \in [n] \qquad (20b)$$
$$\sum_{i=1}^n p_i \le B \qquad (20c)$$
$$0 \le x_i \le 1, \quad \forall i \in [n] \qquad (20d)$$

where $\epsilon_i(x) = \frac{|w_i| x_i}{\sum_{j=1}^n |w_j| (1 - x_j)}$. A budget feasible, individually rational (but not 
necessarily discrete or truthful) canonical Laplace mechanism for the inner 
product has a minimal distortion among all such mechanisms if given input $(v, w, B)$ 
it outputs $(x^*, p^*)$, where the latter constitute an optimal solution to the above 
problem. This characterization will yield the approximation guarantee of the 
DCLEF mechanism.¹ 

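Lemma 6 Recall that $v_1 \le v_2 \le \cdots \le v_n$. For $0 \le k \le n$, define 
$p(k) := \sum_{i=k+1}^n |w_i|$ if $0 \le k \le n - 1$, and $p(n) := 0$. For $0 \le k \le n$, define $q(0) := 0$, and 
$q(k) := \sum_{i=1}^k v_i |w_i|$ if $1 \le k \le n$. Define $\ell := \min\{k : \forall i > k, \; q(i) - B p(i) > 0\}$ 
and let 

$$x_i^* = \begin{cases} 1, & \text{if } i \le \ell \\ \frac{B p(\ell) - q(\ell)}{(v_{\ell+1} + B)|w_{\ell+1}|}, & \text{if } i = \ell + 1 \\ 0, & \text{if } i > \ell + 1 \end{cases} \quad \text{and} \quad p_i^* = v_i |w_i| x_i^* / \sigma(x^*), \; \forall i \in [n].$$

Then $(x^*, p^*)$ is an optimal solution to (20). 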

Proof. We show first that the quantities $\ell$ and $x^*_{\ell+1}$ are well defined. For $p(i)$, 
$q(i)$, $i \in \{0, \ldots, n\}$, as defined in the statement of the lemma, observe that 
$g(i) := q(i) - B p(i)$ is strictly increasing and that $g(0) < 0$ while $g(n) > 0$. 
Hence, $\ell$ is well defined; in particular, $\ell \le n - 1$. The monotonicity of $g$ implies 
that $g(i) \le 0$ for all $0 \le i \le \ell$ and $g(i) > 0$ for $i > \ell$. For $a \in [0, 1]$, let 
$h(a) = q(\ell) + v_{\ell+1} |w_{\ell+1}| a - B(p(\ell + 1) + |w_{\ell+1}|(1 - a))$. Then $h(0) = g(\ell) \le 0$ 
and $h(1) = g(\ell + 1) > 0$. As $h(a)$ is continuous and strictly increasing in the 
reals, there exists a unique $a^* \in [0, 1]$ s.t. $h(a^*) = 0$; since $h$ is linear, it is easy to 
verify that $a^* = (B p(\ell) - q(\ell)) / ((v_{\ell+1} + B)|w_{\ell+1}|) = x^*_{\ell+1}$ and, hence, $x^*_{\ell+1} \in [0, 1]$. 
To solve (20), we need only consider cases for which constraint (20b) is tight, 
i.e., $p_i = v_i \epsilon_i(x)$. Any solution for which (20b) is not tight can be converted to 
a solution where it is; this can only decrease the left-hand side of (20c), so the 
budget constraint remains satisfied, and it does not affect the objective. Thus, (20) is equivalent to: 

$$\text{Max.} \quad F(x) = \sum_{i=1}^n |w_i| x_i \qquad (21a)$$
$$\text{subj. to} \quad \sum_{i=1}^n v_i |w_i| x_i - B \sum_{i=1}^n |w_i| (1 - x_i) \le 0, \quad x \in [0, 1]^n \qquad (21b)$$



¹ An analogous characterization of the budget-limited knapsack mechanism in [12] can 
be used to show that the mechanism is 5-approximate instead of 6-approximate. 
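
It thus suffices to show that $x^*$ is an optimal solution to (21). The latter is 
a linear program and its Lagrangian is 

$$L(x, \lambda, \mu, \nu) = -\sum_{i=1}^n |w_i| x_i + \lambda \Big( \sum_{i=1}^n v_i |w_i| x_i - B \sum_{i=1}^n |w_i| (1 - x_i) \Big) + \sum_{i=1}^n \mu_i (x_i - 1) - \sum_{i=1}^n \nu_i x_i.$$

It is easy to verify that $x^*$ satisfies the KKT conditions of (21) with $\lambda^* = \frac{1}{v_{\ell+1} + B}$, 
$\mu_i^* = 1(i \le \ell) \cdot \frac{v_{\ell+1} - v_i}{v_{\ell+1} + B} |w_i|$, and $\nu_i^* = 1(i > \ell + 1) \cdot \frac{v_i - v_{\ell+1}}{v_{\ell+1} + B} |w_i|$. 

A canonical Laplace mechanism that outputs $(x^*, p^*)$ given by Lemma 6 
would be optimal. Moreover, the objective value $S(x^*; w) \ge \mathrm{OPT}$. 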
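
The structure of the optimal fractional solution can be illustrated with a short Python sketch. It assumes the budget constraint is rewritten as a single knapsack constraint, sum_i (v_i + B)|w_i| x_i <= B W with W the total absolute weight, which is (21b) rearranged; individuals are then filled greedily in increasing order of cost. The function name `fractional_optimum` and the instance are hypothetical, and the sketch is not the mechanism itself.

```python
def fractional_optimum(v, w, B):
    """Greedy solution of  max sum_i |w_i| x_i
    s.t. sum_i (v_i + B) |w_i| x_i <= B * W,  0 <= x_i <= 1."""
    absw = [abs(wi) for wi in w]
    remaining = B * sum(absw)  # right-hand side B * W
    x = [0.0] * len(v)
    # Value per unit of "size" is 1 / (v_i + B), so cheaper individuals come first.
    for i in sorted(range(len(v)), key=lambda i: v[i]):
        size = (v[i] + B) * absw[i]
        take = 1.0 if size == 0 else min(1.0, remaining / size)
        x[i] = take
        remaining -= take * size
        if take < 1.0:
            break  # at most one fractional coordinate, as in Lemma 6
    value = sum(a * xi for a, xi in zip(absw, x))
    return x, value

# Hypothetical instance with unit weights and costs already sorted.
x_star, value = fractional_optimum(v=[1.0, 2.0, 3.0], w=[1.0, 1.0, 1.0], B=2.0)
```

On this instance the first individual is bought fully and the second fractionally, which agrees with the closed form of Lemma 6: here the threshold index is 1 and the fractional coordinate equals (B p(1) - q(1)) / ((v_2 + B)|w_2|) = 3/4.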

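Proposition 6 Let $\ell$ be as defined in Lemma 6, and $k$ as defined in 
FairInnerProduct. Then, $\ell \ge k$. 

Proof. Assume that $\ell < k$. Then 

$$B(W - w([k])) \le B(W - w([\ell + 1])) < \sum_{i=1}^{\ell+1} v_i |w_i| \le \sum_{i=1}^{k} v_i |w_i| \le v_k \sum_{i \le k} |w_i| = v_k w([k]).$$

However, this contradicts the fact that $B/w([k]) \ge v_k/(W - w([k]))$. 

Proposition 7 Let $\{x_i^*\}$ and $\ell$ be as defined in Lemma 6, and $k$ as defined in 
FairInnerProduct. Then, $w([k + 1]) > \sum_{i=k+1}^{\ell+1} |w_i| x_i^*$. 

Proof. If $\ell = k$, the statement is trivially true. Consider thus the case $\ell > k$. 
Assume that $\sum_{i=1}^{k+1} |w_i| \le \sum_{i=k+1}^{\ell+1} |w_i| x_i^*$. Then, 

$$\frac{B(W - w([k + 1]))}{w([k + 1])} \ge \frac{B\big(W - \sum_{i=k+1}^{\ell+1} |w_i| x_i^*\big)}{\sum_{i=k+1}^{\ell+1} |w_i| x_i^*} \ge \frac{B\big(W - \sum_{i=1}^{\ell+1} |w_i| x_i^*\big)}{\sum_{i=k+1}^{\ell+1} |w_i| x_i^*} \ge \frac{\sum_{i=k+1}^{\ell+1} |w_i| v_i x_i^*}{\sum_{i=k+1}^{\ell+1} |w_i| x_i^*} \ge v_{k+1},$$

where the third inequality uses that the budget constraint (21b) is tight at $x^*$, and the last holds 
since $v_{k+1} \le v_i$ for all $(k + 1) \le i \le \ell + 1$. However, this contradicts the fact that 
$B/w([k + 1]) < v_{k+1}/(W - w([k + 1]))$. 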

Now we will show that $S(x; w) \ge \frac{1}{5}\mathrm{OPT}$ using Proposition 7. First notice 
that since (20) is a relaxation of (7), $\mathrm{OPT} \le S(x^*; w)$, where $\{x_i^*\}$ are defined 
in Lemma 6. Therefore, we have that 

$$\mathrm{OPT} \le S(x^*; w) = \sum_{i \le k} |w_i| + \sum_{i=k+1}^{\ell+1} |w_i| x_i^* \overset{\text{Prop. 7}}{<} w([k]) + w([k + 1]) \le 2w([k]) + |w_{i^*}|.$$

It follows that if $O = [k]$, then $w([k]) \ge |w_{i^*}|$ and therefore $w([k]) = S(x; w) \ge \frac{1}{3}\mathrm{OPT}$. 
On the other hand, if $O = \{i^*\}$, then $|w_{i^*}| > \sum_{j \in [k] \setminus \{i^*\}} |w_j|$, which implies 
$2|w_{i^*}| > w([k])$. Therefore, $\mathrm{OPT} \le 2w([k]) + |w_{i^*}| < 5|w_{i^*}| = 5 S(x; w)$. 

D.3 The Uniform-Weight Case 

In this section, we prove that when all weights are equal, FairInnerProduct is 
2-approximate with respect to OPT. 

Let $|w_i| = u$ for all $i \in [n]$. First, observe that in this case, FairInnerProduct 
always outputs $O = [k]$. Therefore, $S(x; w) = ku$. We use this observation to 
prove the result. 

Lemma 7 Assume that for all $i \in [n]$, $|w_i| = u$. Then, $S(x; w) \ge \frac{1}{2}\mathrm{OPT}$. 

Proof. Observe that $\mathrm{OPT} \le S(x^*; w) = \sum_{i=1}^{\ell+1} |w_i| x_i^* = w([k]) + \sum_{i=k+1}^{\ell+1} |w_i| x_i^* < 
w([k]) + w([k + 1])$ from Proposition 7, where $\{x_i^*\}$ and $\ell$ are defined in Lemma 6. 
Substituting $|w_i| = u$ for all $i$, we get $\mathrm{OPT} < (2k + 1)u$. Since OPT is the 
objective value attained by the optimal DCLEF mechanism, $\mathrm{OPT} = mu$ for some 
$m \in [n]$. This implies $2k + 1 > m$. Since $k$ and $m$ are integers, it follows that 
$2k \ge m$, or equivalently, $S(x; w) \ge \frac{1}{2}\mathrm{OPT}$. 

E Proof of Theorem 4 (Hardness of Approximation) 

Consider the following example. Let $n = 4$. The private costs of the four 
individuals are given by $v_1 = a$, $v_2 = v_3 = v_4 = 2$, where $0 < a < 2$. The weights of 
the four individuals are given by $w_1 = w_2 = w_3 = w_4 = d$, where $d > 0$. Let the 
budget $B = 1 + a/2 < 2$. 

Observe that the optimal individually rational, budget-feasible, DCLEF 
mechanism would set $x_1^* = 1$ and exactly one of $x_2^*$, $x_3^*$ and $x_4^*$ to 1. Without loss of 
generality, assume that $x_1^* = x_2^* = 1$ and $x_3^* = x_4^* = 0$. Therefore, the optimal 
weight $\mathrm{OPT} = 2d$. Consider a truthful DCLEF mechanism that is $(2 - \varepsilon)$-approximate, 
for any $\varepsilon > 0$. Such a mechanism must set $x_1 = 1$ (since it is truthful) 
and at least one more $x_i$ to 1 (since it is $(2 - \varepsilon)$-approximate). Therefore, for such 
a mechanism $\sigma(x) \le 2d$. This implies that for such a mechanism, the cost of 
individual 1 is $c_1(\epsilon_1) = v_1 w_1/\sigma(x) \ge v_1 d/(2d) = v_1/2$. Since the mechanism is 
truthful, the payment $p_1$ cannot depend on $v_1$. Also, for this mechanism to be 
individually rational, $p_1$ must be at least 1 (since $v_1$ can be arbitrarily close to 
2), which implies that the remaining budget is strictly less than 1. However, for 
this mechanism, for $i \in \{2, 3, 4\}$, $c_i(\epsilon_i) = 2d/\sigma(x) \ge 1$. This means that this 
mechanism cannot be both individually rational and budget feasible. □ 
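
The value of OPT in this example can be confirmed by brute force. The Python sketch below is an illustration under stated assumptions: it enumerates all subsets of individuals and tests budget feasibility with tight payments, i.e. constraint (21b) evaluated at 0/1 points; the concrete choices a = 1 and d = 1 and the function names are hypothetical.

```python
from itertools import combinations

def feasible(S, v, w, B):
    """Budget feasibility with tight payments: sum_{i in S} v_i |w_i| <= B * sum_{i not in S} |w_i|."""
    inside = sum(v[i] * abs(w[i]) for i in S)
    outside = sum(abs(w[i]) for i in range(len(w)) if i not in S)
    return inside <= B * outside

def opt_weight(v, w, B):
    """Largest total weight of a feasible subset (exhaustive search)."""
    n = len(v)
    return max(
        sum(abs(w[i]) for i in S)
        for r in range(n + 1)
        for S in combinations(range(n), r)
        if feasible(S, v, w, B)
    )

# The instance of the proof with the illustrative choices a = 1 and d = 1.
a, d = 1.0, 1.0
v = [a, 2.0, 2.0, 2.0]
w = [d, d, d, d]
B = 1 + a / 2
opt = opt_weight(v, w, B)
```

The search confirms that the best feasible sets contain individual 1 and exactly one of the others: any pair of expensive individuals, and any set of three, exceeds the budget.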